Overview

Dataset statistics

Number of variables: 26
Number of observations: 143424
Missing cells: 543643
Missing cells (%): 14.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.5 MiB
Average record size in memory: 208.0 B
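The headline figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 1024 × 1024 bytes; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 26
n_observations = 143424
missing_cells = 543643
avg_record_bytes = 208.0

# Assumption: the percentage is missing cells over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total size implied by the average record size (1 MiB = 1024 * 1024 bytes).
total_mib = avg_record_bytes * n_observations / (1024 * 1024)

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"total size:    {total_mib:.1f} MiB")
```

Under these assumptions the rounded results agree with the reported 14.6% and 28.5 MiB.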

Variable types

Numeric: 13
Categorical: 13

Alerts

medical_specialty has a high cardinality: 72 distinct values [High cardinality]
primary_diagnosis_code has a high cardinality: 716 distinct values [High cardinality]
other_diagnosis_codes has a high cardinality: 19374 distinct values [High cardinality]
ndc_code has a high cardinality: 251 distinct values [High cardinality]
encounter_id is highly overall correlated with patient_nbr [High correlation]
patient_nbr is highly overall correlated with encounter_id [High correlation]
race is highly imbalanced (56.8%) [Imbalance]
race has 3309 (2.3%) missing values [Missing]
weight has 139122 (97.0%) missing values [Missing]
payer_code has 54190 (37.8%) missing values [Missing]
medical_specialty has 69463 (48.4%) missing values [Missing]
ndc_code has 23462 (16.4%) missing values [Missing]
max_glu_serum has 136409 (95.1%) missing values [Missing]
A1Cresult has 117650 (82.0%) missing values [Missing]
number_emergency is highly skewed (γ1 = 21.51520047) [Skewed]
number_outpatient has 120027 (83.7%) zeros [Zeros]
number_inpatient has 96698 (67.4%) zeros [Zeros]
number_emergency has 127444 (88.9%) zeros [Zeros]
num_procedures has 65788 (45.9%) zeros [Zeros]

Reproduction

Analysis started: 2023-03-27 18:05:39.578175
Analysis finished: 2023-03-27 18:06:09.244241
Duration: 29.67 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json

Variables

encounter_id
Real number (ℝ)

Distinct: 101766
Distinct (%): 71.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6742899 × 10^8
Minimum: 12522
Maximum: 4.4386722 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 12522
5-th percentile: 28302468
Q1: 88295964
median: 1.5476371 × 10^8
Q3: 2.3208969 × 10^8
95-th percentile: 3.7897561 × 10^8
Maximum: 4.4386722 × 10^8
Range: 4.438547 × 10^8
Interquartile range (IQR): 1.4379372 × 10^8

Descriptive statistics

Standard deviation: 1.0190458 × 10^8
Coefficient of variation (CV): 0.60864356
Kurtosis: -0.099966293
Mean: 1.6742899 × 10^8
Median Absolute Deviation (MAD): 70195368
Skewness: 0.6796994
Sum: 2.4013336 × 10^13
Variance: 1.0384543 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
58316058 6
 
< 0.1%
63415968 6
 
< 0.1%
63184686 6
 
< 0.1%
110310714 6
 
< 0.1%
60016020 6
 
< 0.1%
205689816 5
 
< 0.1%
174335250 5
 
< 0.1%
197878764 5
 
< 0.1%
184457202 5
 
< 0.1%
377841854 5
 
< 0.1%
Other values (101756) 143369
> 99.9%
ValueCountFrequency (%)
12522 2
< 0.1%
15738 2
< 0.1%
16680 2
< 0.1%
28236 1
 
< 0.1%
35754 1
 
< 0.1%
36900 2
< 0.1%
40926 3
< 0.1%
42570 1
 
< 0.1%
55842 3
< 0.1%
62256 1
 
< 0.1%
ValueCountFrequency (%)
443867222 1
 
< 0.1%
443857166 3
< 0.1%
443854148 2
< 0.1%
443847782 1
 
< 0.1%
443847548 2
< 0.1%
443847176 2
< 0.1%
443842778 1
 
< 0.1%
443842340 1
 
< 0.1%
443842136 1
 
< 0.1%
443842070 1
 
< 0.1%
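
Several of the encounter_id statistics above are simple functions of the others. The following sketch recomputes them from the reported values; it assumes the usual definitions (range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared). Because the inputs are the rounded figures printed in the report, agreement is only expected to about seven significant digits.

```python
# Values copied from the encounter_id section; the extreme-value tables
# give the exact minimum and maximum.
minimum, maximum = 12522, 443867222
q1, q3 = 88295964, 2.3208969e8
mean, std = 1.6742899e8, 1.0190458e8
n_rows, n_distinct = 143424, 101766

value_range = maximum - minimum           # reported: 4.438547 × 10^8
iqr = q3 - q1                             # reported: 1.4379372 × 10^8
cv = std / mean                           # reported: 0.60864356
variance = std ** 2                       # reported: 1.0384543 × 10^16
distinct_pct = 100 * n_distinct / n_rows  # reported: 71.0%

print(value_range, iqr, cv, variance, f"{distinct_pct:.1f}%")
```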

patient_nbr
Real number (ℝ)

Distinct: 71518
Distinct (%): 49.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54936079
Minimum: 135
Maximum: 1.8950262 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 135
5-th percentile: 1596573
Q1: 23572188
median: 46307830
Q3: 88236270
95-th percentile: 1.1168201 × 10^8
Maximum: 1.8950262 × 10^8
Range: 1.8950248 × 10^8
Interquartile range (IQR): 64664082

Descriptive statistics

Standard deviation: 38578400
Coefficient of variation (CV): 0.70224159
Kurtosis: -0.32759223
Mean: 54936079
Median Absolute Deviation (MAD): 31955810
Skewness: 0.46901037
Sum: 7.8791522 × 10^12
Variance: 1.4882929 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
90609804 52
 
< 0.1%
91751121 50
 
< 0.1%
89472402 50
 
< 0.1%
62352252 46
 
< 0.1%
84397842 41
 
< 0.1%
29903877 40
 
< 0.1%
88785891 40
 
< 0.1%
90164655 38
 
< 0.1%
37096866 38
 
< 0.1%
43140906 33
 
< 0.1%
Other values (71508) 142996
99.7%
ValueCountFrequency (%)
135 5
< 0.1%
378 1
 
< 0.1%
729 1
 
< 0.1%
774 2
 
< 0.1%
927 1
 
< 0.1%
1152 5
< 0.1%
1305 1
 
< 0.1%
1314 4
< 0.1%
1629 1
 
< 0.1%
2025 2
 
< 0.1%
ValueCountFrequency (%)
189502619 1
 
< 0.1%
189481478 3
< 0.1%
189445127 4
< 0.1%
189365864 1
 
< 0.1%
189351095 1
 
< 0.1%
189349430 1
 
< 0.1%
189332087 2
< 0.1%
189298877 1
 
< 0.1%
189257846 3
< 0.1%
189215762 1
 
< 0.1%

race
Categorical

IMBALANCE  MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 3309
Missing (%): 2.3%
Memory size: 1.1 MiB
Caucasian
107688 
AfricanAmerican
26427 
Hispanic
 
2938
Other
 
2174
Asian
 
888

Length

Max length: 15
Median length: 9
Mean length: 10.023274
Min length: 5

Characters and Unicode

Total characters: 1404411
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Caucasian
2nd row: Caucasian
3rd row: AfricanAmerican
4th row: Caucasian
5th row: Caucasian

Common Values

Value              Count    Frequency (%)
Caucasian          107688   75.1%
AfricanAmerican    26427    18.4%
Hispanic           2938     2.0%
Other              2174     1.5%
Asian              888      0.6%
(Missing)          3309     2.3%
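
The race counts above explain two things that are easy to misread in this report: why Caucasian appears as 75.1% in one table and 76.9% in another, and where the 56.8% imbalance alert may come from. The sketch below assumes the first percentage is taken over all rows and the second over non-missing rows only. It also assumes the imbalance score is one minus the Shannon entropy normalised by the maximum possible entropy for five classes; this is an assumption about the profiler rather than something stated in the report, although it reproduces the reported figure.

```python
import math

# Counts copied from the "Common Values" table for race.
counts = {
    "Caucasian": 107688,
    "AfricanAmerican": 26427,
    "Hispanic": 2938,
    "Other": 2174,
    "Asian": 888,
}
n_missing = 3309

n_present = sum(counts.values())
n_rows = n_present + n_missing

# Share of the largest class with and without the missing rows.
share_all_rows = 100 * counts["Caucasian"] / n_rows
share_non_missing = 100 * counts["Caucasian"] / n_present

# Assumed imbalance score: 1 - H / H_max, with entropy H measured in bits.
probs = [c / n_present for c in counts.values()]
entropy_bits = -sum(p * math.log2(p) for p in probs)
imbalance = 1 - entropy_bits / math.log2(len(counts))

print(f"{share_all_rows:.1f}% of all rows, {share_non_missing:.1f}% of non-missing rows")
print(f"imbalance score: {100 * imbalance:.1f}%")
```

A score of 0% would mean five equally frequent classes and 100% a single class, so a value near 57% reflects the dominance of one category.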

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
caucasian 107688
76.9%
africanamerican 26427
 
18.9%
hispanic 2938
 
2.1%
other 2174
 
1.6%
asian 888
 
0.6%

Most occurring characters

ValueCountFrequency (%)
a 379744
27.0%
i 167306
11.9%
n 164368
11.7%
c 163480
11.6%
s 111514
 
7.9%
C 107688
 
7.7%
u 107688
 
7.7%
r 55028
 
3.9%
A 53742
 
3.8%
e 28601
 
2.0%
Other values (7) 65252
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1237869
88.1%
Uppercase Letter 166542
 
11.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 379744
30.7%
i 167306
13.5%
n 164368
13.3%
c 163480
13.2%
s 111514
 
9.0%
u 107688
 
8.7%
r 55028
 
4.4%
e 28601
 
2.3%
f 26427
 
2.1%
m 26427
 
2.1%
Other values (3) 7286
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
C 107688
64.7%
A 53742
32.3%
H 2938
 
1.8%
O 2174
 
1.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1404411
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 379744
27.0%
i 167306
11.9%
n 164368
11.7%
c 163480
11.6%
s 111514
 
7.9%
C 107688
 
7.7%
u 107688
 
7.7%
r 55028
 
3.9%
A 53742
 
3.8%
e 28601
 
2.0%
Other values (7) 65252
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1404411
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 379744
27.0%
i 167306
11.9%
n 164368
11.7%
c 163480
11.6%
s 111514
 
7.9%
C 107688
 
7.7%
u 107688
 
7.7%
r 55028
 
3.9%
A 53742
 
3.8%
e 28601
 
2.0%
Other values (7) 65252
 
4.6%

gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 5
Missing (%): < 0.1%
Memory size: 1.1 MiB
Female
76185 
Male
67234 

Length

Max length: 6
Median length: 6
Mean length: 5.0624115
Min length: 4

Characters and Unicode

Total characters: 726046
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Female
3rd row: Female
4th row: Male
5th row: Male

Common Values

Value        Count    Frequency (%)
Female       76185    53.1%
Male         67234    46.9%
(Missing)    5        < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
female 76185
53.1%
male 67234
46.9%

Most occurring characters

ValueCountFrequency (%)
e 219604
30.2%
a 143419
19.8%
l 143419
19.8%
F 76185
 
10.5%
m 76185
 
10.5%
M 67234
 
9.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 582627
80.2%
Uppercase Letter 143419
 
19.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 219604
37.7%
a 143419
24.6%
l 143419
24.6%
m 76185
 
13.1%
Uppercase Letter
ValueCountFrequency (%)
F 76185
53.1%
M 67234
46.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 726046
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 219604
30.2%
a 143419
19.8%
l 143419
19.8%
F 76185
 
10.5%
m 76185
 
10.5%
M 67234
 
9.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 726046
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 219604
30.2%
a 143419
19.8%
l 143419
19.8%
F 76185
 
10.5%
m 76185
 
10.5%
M 67234
 
9.3%

age
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB
[70-80)
36928 
[60-70)
32741 
[50-60)
25095 
[80-90)
23527 
[40-50)
13729 
Other values (5)
11404 

Length

Max length: 8
Median length: 7
Mean length: 7.0241103
Min length: 6

Characters and Unicode

Total characters: 1007426
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: [0-10)
2nd row: [10-20)
3rd row: [20-30)
4th row: [30-40)
5th row: [40-50)

Common Values

Value       Count    Frequency (%)
[70-80)     36928    25.7%
[60-70)     32741    22.8%
[50-60)     25095    17.5%
[80-90)     23527    16.4%
[40-50)     13729    9.6%
[30-40)     4964     3.5%
[90-100)    3619     2.5%
[20-30)     1927     1.3%
[10-20)     733      0.5%
[0-10)      161      0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
70-80 36928
25.7%
60-70 32741
22.8%
50-60 25095
17.5%
80-90 23527
16.4%
40-50 13729
 
9.6%
30-40 4964
 
3.5%
90-100 3619
 
2.5%
20-30 1927
 
1.3%
10-20 733
 
0.5%
0-10 161
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 290467
28.8%
[ 143424
14.2%
- 143424
14.2%
) 143424
14.2%
7 69669
 
6.9%
8 60455
 
6.0%
6 57836
 
5.7%
5 38824
 
3.9%
9 27146
 
2.7%
4 18693
 
1.9%
Other values (3) 14064
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 577154
57.3%
Open Punctuation 143424
 
14.2%
Dash Punctuation 143424
 
14.2%
Close Punctuation 143424
 
14.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 290467
50.3%
7 69669
 
12.1%
8 60455
 
10.5%
6 57836
 
10.0%
5 38824
 
6.7%
9 27146
 
4.7%
4 18693
 
3.2%
3 6891
 
1.2%
1 4513
 
0.8%
2 2660
 
0.5%
Open Punctuation
ValueCountFrequency (%)
[ 143424
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 143424
100.0%
Close Punctuation
ValueCountFrequency (%)
) 143424
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1007426
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 290467
28.8%
[ 143424
14.2%
- 143424
14.2%
) 143424
14.2%
7 69669
 
6.9%
8 60455
 
6.0%
6 57836
 
5.7%
5 38824
 
3.9%
9 27146
 
2.7%
4 18693
 
1.9%
Other values (3) 14064
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1007426
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 290467
28.8%
[ 143424
14.2%
- 143424
14.2%
) 143424
14.2%
7 69669
 
6.9%
8 60455
 
6.0%
6 57836
 
5.7%
5 38824
 
3.9%
9 27146
 
2.7%
4 18693
 
1.9%
Other values (3) 14064
 
1.4%

weight
Categorical

Distinct: 9
Distinct (%): 0.2%
Missing: 139122
Missing (%): 97.0%
Memory size: 1.1 MiB
[75-100)
1817 
[50-75)
1133 
[100-125)
890 
[125-150)
200 
[25-50)
 
118
Other values (4)
 
144

Length

Max length: 9
Median length: 8
Mean length: 7.9446769
Min length: 4

Characters and Unicode

Total characters: 34178
Distinct characters: 9
Distinct categories: 5
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: [75-100)
2nd row: [50-75)
3rd row: [50-75)
4th row: [0-25)
5th row: [0-25)

Common Values

Value        Count     Frequency (%)
[75-100)     1817      1.3%
[50-75)      1133      0.8%
[100-125)    890       0.6%
[125-150)    200       0.1%
[25-50)      118       0.1%
[0-25)       67        < 0.1%
[150-175)    55        < 0.1%
[175-200)    18        < 0.1%
>200         4         < 0.1%
(Missing)    139122    97.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
75-100 1817
42.2%
50-75 1133
26.3%
100-125 890
20.7%
125-150 200
 
4.6%
25-50 118
 
2.7%
0-25 67
 
1.6%
150-175 55
 
1.3%
175-200 18
 
0.4%
200 4
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 7031
20.6%
5 5804
17.0%
[ 4298
12.6%
- 4298
12.6%
) 4298
12.6%
1 4125
12.1%
7 3023
8.8%
2 1297
 
3.8%
> 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 21280
62.3%
Open Punctuation 4298
 
12.6%
Dash Punctuation 4298
 
12.6%
Close Punctuation 4298
 
12.6%
Math Symbol 4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7031
33.0%
5 5804
27.3%
1 4125
19.4%
7 3023
14.2%
2 1297
 
6.1%
Open Punctuation
ValueCountFrequency (%)
[ 4298
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4298
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4298
100.0%
Math Symbol
ValueCountFrequency (%)
> 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 34178
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 7031
20.6%
5 5804
17.0%
[ 4298
12.6%
- 4298
12.6%
) 4298
12.6%
1 4125
12.1%
7 3023
8.8%
2 1297
 
3.8%
> 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34178
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7031
20.6%
5 5804
17.0%
[ 4298
12.6%
- 4298
12.6%
) 4298
12.6%
1 4125
12.1%
7 3023
8.8%
2 1297
 
3.8%
> 4
 
< 0.1%

admission_type_id
Real number (ℝ)

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0276941
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 3
95-th percentile: 6
Maximum: 8
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4275852
Coefficient of variation (CV): 0.70404366
Kurtosis: 2.0527903
Mean: 2.0276941
Median Absolute Deviation (MAD): 0
Skewness: 1.5939553
Sum: 290820
Variance: 2.0379995
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
ValueCountFrequency (%)
1 74713
52.1%
3 27756
 
19.4%
2 26823
 
18.7%
6 7015
 
4.9%
5 6584
 
4.6%
8 488
 
0.3%
7 33
 
< 0.1%
4 12
 
< 0.1%
ValueCountFrequency (%)
1 74713
52.1%
2 26823
 
18.7%
3 27756
 
19.4%
4 12
 
< 0.1%
5 6584
 
4.6%
6 7015
 
4.9%
7 33
 
< 0.1%
8 488
 
0.3%
ValueCountFrequency (%)
8 488
 
0.3%
7 33
 
< 0.1%
6 7015
 
4.9%
5 6584
 
4.6%
4 12
 
< 0.1%
3 27756
 
19.4%
2 26823
 
18.7%
1 74713
52.1%

discharge_disposition_id
Real number (ℝ)

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.6553157
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 3
95-th percentile: 18
Maximum: 28
Range: 27
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 5.2192793
Coefficient of variation (CV): 1.4278601
Kurtosis: 6.4193987
Mean: 3.6553157
Median Absolute Deviation (MAD): 0
Skewness: 2.6336162
Sum: 524260
Variance: 27.240876
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%)
1 85308
59.5%
3 19677
 
13.7%
6 18945
 
13.2%
18 4658
 
3.2%
22 3077
 
2.1%
2 2906
 
2.0%
11 1911
 
1.3%
5 1631
 
1.1%
25 1285
 
0.9%
4 1090
 
0.8%
Other values (16) 2936
 
2.0%
ValueCountFrequency (%)
1 85308
59.5%
2 2906
 
2.0%
3 19677
 
13.7%
4 1090
 
0.8%
5 1631
 
1.1%
6 18945
 
13.2%
7 782
 
0.5%
8 147
 
0.1%
9 29
 
< 0.1%
10 6
 
< 0.1%
ValueCountFrequency (%)
28 200
 
0.1%
27 5
 
< 0.1%
25 1285
 
0.9%
24 65
 
< 0.1%
23 602
 
0.4%
22 3077
2.1%
20 4
 
< 0.1%
19 8
 
< 0.1%
18 4658
3.2%
17 20
 
< 0.1%

admission_source_id
Real number (ℝ)

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.7010961
Minimum: 1
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 7
Q3: 7
95-th percentile: 17
Maximum: 25
Range: 24
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.0645318
Coefficient of variation (CV): 0.71293866
Kurtosis: 1.753153
Mean: 5.7010961
Median Absolute Deviation (MAD): 0
Skewness: 1.033245
Sum: 817674
Variance: 16.520419
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
ValueCountFrequency (%)
7 80443
56.1%
1 42773
29.8%
17 9338
 
6.5%
4 4467
 
3.1%
6 3108
 
2.2%
2 1500
 
1.0%
5 1048
 
0.7%
20 247
 
0.2%
3 247
 
0.2%
9 185
 
0.1%
Other values (7) 68
 
< 0.1%
ValueCountFrequency (%)
1 42773
29.8%
2 1500
 
1.0%
3 247
 
0.2%
4 4467
 
3.1%
5 1048
 
0.7%
6 3108
 
2.2%
7 80443
56.1%
8 27
 
< 0.1%
9 185
 
0.1%
10 10
 
< 0.1%
ValueCountFrequency (%)
25 4
 
< 0.1%
22 21
 
< 0.1%
20 247
 
0.2%
17 9338
6.5%
14 2
 
< 0.1%
13 1
 
< 0.1%
11 3
 
< 0.1%
10 10
 
< 0.1%
9 185
 
0.1%
8 27
 
< 0.1%

time_in_hospital
Real number (ℝ)

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4901899
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 11
Maximum: 14
Range: 13
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.9996667
Coefficient of variation (CV): 0.66804896
Kurtosis: 0.76214813
Mean: 4.4901899
Median Absolute Deviation (MAD): 2
Skewness: 1.1019709
Sum: 644001
Variance: 8.9980003
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
ValueCountFrequency (%)
3 24986
17.4%
2 23475
16.4%
4 20064
14.0%
1 18530
12.9%
5 14452
10.1%
6 10880
7.6%
7 8569
 
6.0%
8 6464
 
4.5%
9 4432
 
3.1%
10 3416
 
2.4%
Other values (4) 8156
 
5.7%
ValueCountFrequency (%)
1 18530
12.9%
2 23475
16.4%
3 24986
17.4%
4 20064
14.0%
5 14452
10.1%
6 10880
7.6%
7 8569
 
6.0%
8 6464
 
4.5%
9 4432
 
3.1%
10 3416
 
2.4%
ValueCountFrequency (%)
14 1526
 
1.1%
13 1807
 
1.3%
12 2116
 
1.5%
11 2707
 
1.9%
10 3416
 
2.4%
9 4432
 
3.1%
8 6464
4.5%
7 8569
6.0%
6 10880
7.6%
5 14452
10.1%
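
Because time_in_hospital takes only 14 distinct values, the frequency tables above list every value with its count (values 11 to 14 appear in the table of largest values). That is enough to rebuild the summary statistics without the raw data. The sketch below assumes the reported variance is the sample variance (divisor n − 1) and uses a simple "first value whose cumulative count reaches q·n" rule for quantiles. The profiler may interpolate differently, although the two approaches agree for this integer-valued column.

```python
# value -> count, copied from the time_in_hospital frequency tables.
counts = {
    1: 18530, 2: 23475, 3: 24986, 4: 20064, 5: 14452, 6: 10880, 7: 8569,
    8: 6464, 9: 4432, 10: 3416, 11: 2707, 12: 2116, 13: 1807, 14: 1526,
}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Assumption: sample variance with divisor n - 1.
sample_var = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = sample_var ** 0.5


def quantile_from_counts(q):
    """Return the smallest value whose cumulative count reaches q * n."""
    threshold = q * n
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running >= threshold:
            return value
    return max(counts)


quantiles = [quantile_from_counts(q) for q in (0.05, 0.25, 0.5, 0.75, 0.95)]

print(n, total, round(mean, 7), round(sample_var, 7), round(std, 7))
print(quantiles)
```

The recomputed count, sum, mean, variance and quantiles line up with the values reported in the statistics blocks for this variable.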

payer_code
Categorical

Distinct: 17
Distinct (%): < 0.1%
Missing: 54190
Missing (%): 37.8%
Memory size: 1.1 MiB
MC
46532 
HM
8784 
SP
7613 
BC
6991 
MD
4983 
Other values (12)
14331 

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 178468
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: MC
2nd row: MC
3rd row: MC
4th row: MC
5th row: MC

Common Values

Value               Count    Frequency (%)
MC                  46532    32.4%
HM                  8784     6.1%
SP                  7613     5.3%
BC                  6991     4.9%
MD                  4983     3.5%
CP                  3687     2.6%
UN                  3665     2.6%
CM                  2971     2.1%
OG                  1532     1.1%
PO                  919      0.6%
Other values (7)    1557     1.1%
(Missing)           54190    37.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
mc 46532
52.1%
hm 8784
 
9.8%
sp 7613
 
8.5%
bc 6991
 
7.8%
md 4983
 
5.6%
cp 3687
 
4.1%
un 3665
 
4.1%
cm 2971
 
3.3%
og 1532
 
1.7%
po 919
 
1.0%
Other values (7) 1557
 
1.7%

Most occurring characters

ValueCountFrequency (%)
M 64149
35.9%
C 60619
34.0%
P 12341
 
6.9%
H 8992
 
5.0%
S 7692
 
4.3%
B 6991
 
3.9%
D 5740
 
3.2%
U 3665
 
2.1%
N 3665
 
2.1%
O 2611
 
1.5%
Other values (6) 2003
 
1.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 178468
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 64149
35.9%
C 60619
34.0%
P 12341
 
6.9%
H 8992
 
5.0%
S 7692
 
4.3%
B 6991
 
3.9%
D 5740
 
3.2%
U 3665
 
2.1%
N 3665
 
2.1%
O 2611
 
1.5%
Other values (6) 2003
 
1.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 178468
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 64149
35.9%
C 60619
34.0%
P 12341
 
6.9%
H 8992
 
5.0%
S 7692
 
4.3%
B 6991
 
3.9%
D 5740
 
3.2%
U 3665
 
2.1%
N 3665
 
2.1%
O 2611
 
1.5%
Other values (6) 2003
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 178468
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 64149
35.9%
C 60619
34.0%
P 12341
 
6.9%
H 8992
 
5.0%
S 7692
 
4.3%
B 6991
 
3.9%
D 5740
 
3.2%
U 3665
 
2.1%
N 3665
 
2.1%
O 2611
 
1.5%
Other values (6) 2003
 
1.1%

medical_specialty
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 72
Distinct (%): 0.1%
Missing: 69463
Missing (%): 48.4%
Memory size: 1.1 MiB
InternalMedicine
20403 
Emergency/Trauma
11595 
Family/GeneralPractice
10508 
Cardiology
7473 
Surgery-General
4387 
Other values (67)
19595 

Length

Max length: 36
Median length: 33
Mean length: 15.970038
Min length: 6

Characters and Unicode

Total characters: 1181160
Distinct characters: 43
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: Pediatrics-Endocrinology
2nd row: InternalMedicine
3rd row: InternalMedicine
4th row: Family/GeneralPractice
5th row: Family/GeneralPractice

Common Values

Value                         Count    Frequency (%)
InternalMedicine              20403    14.2%
Emergency/Trauma              11595    8.1%
Family/GeneralPractice        10508    7.3%
Cardiology                    7473     5.2%
Surgery-General               4387     3.1%
Orthopedics                   2236     1.6%
Nephrology                    1918     1.3%
Orthopedics-Reconstructive    1867     1.3%
Radiologist                   1611     1.1%
Pulmonology                   1334     0.9%
Other values (62)             10629    7.4%
(Missing)                     69463    48.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
internalmedicine 20403
27.6%
emergency/trauma 11595
15.7%
family/generalpractice 10508
14.2%
cardiology 7473
 
10.1%
surgery-general 4387
 
5.9%
orthopedics 2236
 
3.0%
nephrology 1918
 
2.6%
orthopedics-reconstructive 1867
 
2.5%
radiologist 1611
 
2.2%
pulmonology 1334
 
1.8%
Other values (62) 10629
14.4%

Most occurring characters

ValueCountFrequency (%)
e 149664
12.7%
r 111049
 
9.4%
a 102350
 
8.7%
n 96975
 
8.2%
i 89015
 
7.5%
c 71841
 
6.1%
l 68476
 
5.8%
y 49970
 
4.2%
t 48161
 
4.1%
o 47944
 
4.1%
Other values (33) 345715
29.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1007746
85.3%
Uppercase Letter 140263
 
11.9%
Other Punctuation 23480
 
2.0%
Dash Punctuation 9671
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 149664
14.9%
r 111049
11.0%
a 102350
10.2%
n 96975
9.6%
i 89015
8.8%
c 71841
7.1%
l 68476
 
6.8%
y 49970
 
5.0%
t 48161
 
4.8%
o 47944
 
4.8%
Other values (13) 172301
17.1%
Uppercase Letter
ValueCountFrequency (%)
M 20991
15.0%
I 20470
14.6%
G 16620
11.8%
P 14767
10.5%
T 12839
9.2%
E 11944
8.5%
F 10524
7.5%
C 8912
6.4%
S 7574
 
5.4%
O 6071
 
4.3%
Other values (7) 9551
6.8%
Other Punctuation
ValueCountFrequency (%)
/ 23435
99.8%
& 45
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 9671
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1148009
97.2%
Common 33151
 
2.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 149664
13.0%
r 111049
 
9.7%
a 102350
 
8.9%
n 96975
 
8.4%
i 89015
 
7.8%
c 71841
 
6.3%
l 68476
 
6.0%
y 49970
 
4.4%
t 48161
 
4.2%
o 47944
 
4.2%
Other values (30) 312564
27.2%
Common
ValueCountFrequency (%)
/ 23435
70.7%
- 9671
29.2%
& 45
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1181160
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 149664
12.7%
r 111049
 
9.4%
a 102350
 
8.7%
n 96975
 
8.2%
i 89015
 
7.5%
c 71841
 
6.1%
l 68476
 
5.8%
y 49970
 
4.2%
t 48161
 
4.1%
o 47944
 
4.1%
Other values (33) 345715
29.3%

primary_diagnosis_code
Categorical

HIGH CARDINALITY

Distinct: 716
Distinct (%): 0.5%
Missing: 33
Missing (%): < 0.1%
Memory size: 1.1 MiB
414
 
9473
428
 
9385
786
 
5432
486
 
5226
410
 
5076
Other values (711)
108799 

Length

Max length: 6
Median length: 3
Mean length: 3.1709522
Min length: 1

Characters and Unicode

Total characters: 454686
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 63
Unique (%): < 0.1%

Sample

1st row: 250.83
2nd row: 276
3rd row: 648
4th row: 8
5th row: 197

Common Values

Value                 Count    Frequency (%)
414                   9473     6.6%
428                   9385     6.5%
786                   5432     3.8%
486                   5226     3.6%
410                   5076     3.5%
427                   3921     2.7%
491                   3572     2.5%
715                   3514     2.5%
682                   3206     2.2%
434                   3071     2.1%
Other values (706)    91515    63.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
414 9473
 
6.6%
428 9385
 
6.5%
786 5432
 
3.8%
486 5226
 
3.6%
410 5076
 
3.5%
427 3921
 
2.7%
491 3572
 
2.5%
715 3514
 
2.5%
682 3206
 
2.2%
434 3071
 
2.1%
Other values (706) 91515
63.8%

Most occurring characters

ValueCountFrequency (%)
4 79420
17.5%
2 56849
12.5%
8 53197
11.7%
5 50818
11.2%
1 39934
8.8%
7 39829
8.8%
0 34947
7.7%
6 32215
7.1%
9 28400
 
6.2%
3 24992
 
5.5%
Other values (3) 14085
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 440601
96.9%
Other Punctuation 11699
 
2.6%
Uppercase Letter 2386
 
0.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 79420
18.0%
2 56849
12.9%
8 53197
12.1%
5 50818
11.5%
1 39934
9.1%
7 39829
9.0%
0 34947
7.9%
6 32215
7.3%
9 28400
 
6.4%
3 24992
 
5.7%
Uppercase Letter
ValueCountFrequency (%)
V 2384
99.9%
E 2
 
0.1%
Other Punctuation
ValueCountFrequency (%)
. 11699
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 452300
99.5%
Latin 2386
 
0.5%

Most frequent character per script

Common
ValueCountFrequency (%)
4 79420
17.6%
2 56849
12.6%
8 53197
11.8%
5 50818
11.2%
1 39934
8.8%
7 39829
8.8%
0 34947
7.7%
6 32215
7.1%
9 28400
 
6.3%
3 24992
 
5.5%
Latin
ValueCountFrequency (%)
V 2384
99.9%
E 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 454686
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 79420
17.5%
2 56849
12.5%
8 53197
11.7%
5 50818
11.2%
1 39934
8.8%
7 39829
8.8%
0 34947
7.7%
6 32215
7.1%
9 28400
 
6.2%
3 24992
 
5.5%
Other values (3) 14085
 
3.1%

other_diagnosis_codes
Categorical

HIGH CARDINALITY

Distinct: 19374
Distinct (%): 13.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB
250|401
 
3637
401|250
 
3060
276|276
 
968
414|250
 
922
428|427
 
911
Other values (19369)
133926 

Length

Max length: 13
Median length: 7
Mean length: 7.2870091
Min length: 3

Characters and Unicode

Total characters: 1045132
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7908
Unique (%): 5.5%

Sample

1st row: ?|?
2nd row: 250.01|255
3rd row: 250|V27
4th row: 250.43|403
5th row: 157|250

Common Values

Value                   Count     Frequency (%)
250|401                 3637      2.5%
401|250                 3060      2.1%
276|276                 968       0.7%
414|250                 922       0.6%
428|427                 911       0.6%
250|272                 867       0.6%
403|585                 805       0.6%
276|250                 740       0.5%
414|401                 719       0.5%
250|?                   689       0.5%
Other values (19364)    130106    90.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
250|401 3637
 
2.5%
401|250 3060
 
2.1%
276|276 968
 
0.7%
414|250 922
 
0.6%
428|427 911
 
0.6%
250|272 867
 
0.6%
403|585 805
 
0.6%
276|250 740
 
0.5%
250 719
 
0.5%
414|401 719
 
0.5%
Other values (19337) 130076
90.7%

Most occurring characters

ValueCountFrequency (%)
2 144879
13.9%
| 143424
13.7%
4 140989
13.5%
5 110514
10.6%
0 105491
10.1%
7 77580
7.4%
8 73866
7.1%
1 72404
6.9%
9 55576
 
5.3%
6 50950
 
4.9%
Other values (5) 69459
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 871198
83.4%
Math Symbol 143424
 
13.7%
Other Punctuation 19917
 
1.9%
Uppercase Letter 10593
 
1.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 144879
16.6%
4 140989
16.2%
5 110514
12.7%
0 105491
12.1%
7 77580
8.9%
8 73866
8.5%
1 72404
8.3%
9 55576
 
6.4%
6 50950
 
5.8%
3 38949
 
4.5%
Other Punctuation
ValueCountFrequency (%)
. 17593
88.3%
? 2324
 
11.7%
Uppercase Letter
ValueCountFrequency (%)
V 7706
72.7%
E 2887
 
27.3%
Math Symbol
ValueCountFrequency (%)
| 143424
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1034539
99.0%
Latin 10593
 
1.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 144879
14.0%
| 143424
13.9%
4 140989
13.6%
5 110514
10.7%
0 105491
10.2%
7 77580
7.5%
8 73866
7.1%
1 72404
7.0%
9 55576
 
5.4%
6 50950
 
4.9%
Other values (3) 58866
5.7%
Latin
ValueCountFrequency (%)
V 7706
72.7%
E 2887
 
27.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1045132
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 144879
13.9%
| 143424
13.7%
4 140989
13.5%
5 110514
10.6%
0 105491
10.1%
7 77580
7.4%
8 73866
7.1%
1 72404
6.9%
9 55576
 
5.3%
6 50950
 
4.9%
Other values (5) 69459
6.6%

number_outpatient
Real number (ℝ)

Distinct: 39
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.36242888
Minimum: 0
Maximum: 42
Zeros: 120027
Zeros (%): 83.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 42
Range: 42
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.2492946
Coefficient of variation (CV): 3.4470062
Kurtosis: 162.7107
Mean: 0.36242888
Median Absolute Deviation (MAD): 0
Skewness: 9.1746431
Sum: 51981
Variance: 1.560737
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
ValueCountFrequency (%)
0 120027
83.7%
1 11976
 
8.4%
2 5128
 
3.6%
3 2808
 
2.0%
4 1501
 
1.0%
5 749
 
0.5%
6 415
 
0.3%
7 203
 
0.1%
8 133
 
0.1%
9 108
 
0.1%
Other values (29) 376
 
0.3%
ValueCountFrequency (%)
0 120027
83.7%
1 11976
 
8.4%
2 5128
 
3.6%
3 2808
 
2.0%
4 1501
 
1.0%
5 749
 
0.5%
6 415
 
0.3%
7 203
 
0.1%
8 133
 
0.1%
9 108
 
0.1%
ValueCountFrequency (%)
42 1
 
< 0.1%
40 2
 
< 0.1%
39 1
 
< 0.1%
38 1
 
< 0.1%
37 2
 
< 0.1%
36 7
< 0.1%
35 3
< 0.1%
34 1
 
< 0.1%
33 2
 
< 0.1%
29 2
 
< 0.1%

number_inpatient
Real number (ℝ)

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.60085481
Minimum: 0
Maximum: 21
Zeros: 96698
Zeros (%): 67.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 3
Maximum: 21
Range: 21
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.207934
Coefficient of variation (CV): 2.0103591
Kurtosis: 21.136635
Mean: 0.60085481
Median Absolute Deviation (MAD): 0
Skewness: 3.6429012
Sum: 86177
Variance: 1.4591044
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)

Common values

Value | Count | Frequency (%)
0 | 96698 | 67.4%
1 | 27427 | 19.1%
2 | 10194 | 7.1%
3 | 4472 | 3.1%
4 | 2120 | 1.5%
5 | 1031 | 0.7%
6 | 597 | 0.4%
7 | 334 | 0.2%
8 | 179 | 0.1%
9 | 141 | 0.1%
Other values (11) | 231 | 0.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 96698 | 67.4%
1 | 27427 | 19.1%
2 | 10194 | 7.1%
3 | 4472 | 3.1%
4 | 2120 | 1.5%
5 | 1031 | 0.7%
6 | 597 | 0.4%
7 | 334 | 0.2%
8 | 179 | 0.1%
9 | 141 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
21 | 1 | < 0.1%
19 | 2 | < 0.1%
18 | 1 | < 0.1%
17 | 1 | < 0.1%
16 | 6 | < 0.1%
15 | 9 | < 0.1%
14 | 14 | < 0.1%
13 | 22 | < 0.1%
12 | 38 | < 0.1%
11 | 57 | < 0.1%

number_emergency
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 33
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1950859
Minimum: 0
Maximum: 76
Zeros: 127444
Zeros (%): 88.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 76
Range: 76
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.92041018
Coefficient of variation (CV): 4.7179739
Kurtosis: 1038.0835
Mean: 0.1950859
Median Absolute Deviation (MAD): 0
Skewness: 21.5152
Sum: 27980
Variance: 0.8471549
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)

Common values

Value | Count | Frequency (%)
0 | 127444 | 88.9%
1 | 10897 | 7.6%
2 | 2867 | 2.0%
3 | 964 | 0.7%
4 | 491 | 0.3%
5 | 252 | 0.2%
6 | 129 | 0.1%
7 | 88 | 0.1%
8 | 59 | < 0.1%
9 | 48 | < 0.1%
Other values (23) | 185 | 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 127444 | 88.9%
1 | 10897 | 7.6%
2 | 2867 | 2.0%
3 | 964 | 0.7%
4 | 491 | 0.3%
5 | 252 | 0.2%
6 | 129 | 0.1%
7 | 88 | 0.1%
8 | 59 | < 0.1%
9 | 48 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
76 | 1 | < 0.1%
64 | 1 | < 0.1%
63 | 1 | < 0.1%
54 | 1 | < 0.1%
46 | 3 | < 0.1%
42 | 1 | < 0.1%
37 | 3 | < 0.1%
29 | 1 | < 0.1%
28 | 1 | < 0.1%
25 | 3 | < 0.1%

num_lab_procedures
Real number (ℝ)

Distinct: 118
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.255745
Minimum: 1
Maximum: 132
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 32
median: 44
Q3: 57
95-th percentile: 73
Maximum: 132
Range: 131
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 19.657319
Coefficient of variation (CV): 0.45444412
Kurtosis: -0.23026881
Mean: 43.255745
Median Absolute Deviation (MAD): 13
Skewness: -0.25735981
Sum: 6203912
Variance: 386.41019
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 4656 | 3.2%
43 | 4011 | 2.8%
44 | 3608 | 2.5%
45 | 3387 | 2.4%
38 | 3175 | 2.2%
46 | 3107 | 2.2%
40 | 3086 | 2.2%
41 | 2990 | 2.1%
47 | 2981 | 2.1%
42 | 2928 | 2.0%
Other values (108) | 109495 | 76.3%

Minimum 10 values

Value | Count | Frequency (%)
1 | 4656 | 3.2%
2 | 1545 | 1.1%
3 | 961 | 0.7%
4 | 532 | 0.4%
5 | 427 | 0.3%
6 | 392 | 0.3%
7 | 420 | 0.3%
8 | 495 | 0.3%
9 | 1265 | 0.9%
10 | 1143 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
132 | 1 | < 0.1%
129 | 1 | < 0.1%
126 | 1 | < 0.1%
121 | 1 | < 0.1%
120 | 2 | < 0.1%
118 | 1 | < 0.1%
114 | 2 | < 0.1%
113 | 6 | < 0.1%
111 | 3 | < 0.1%
109 | 6 | < 0.1%

number_diagnoses
Real number (ℝ)

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.4244338
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 6
median: 8
Q3: 9
95-th percentile: 9
Maximum: 16
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.9248717
Coefficient of variation (CV): 0.25926175
Kurtosis: -0.106884
Mean: 7.4244338
Median Absolute Deviation (MAD): 1
Skewness: -0.86753017
Sum: 1064842
Variance: 3.7051311
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)

Common values

Value | Count | Frequency (%)
9 | 69427 | 48.4%
5 | 16215 | 11.3%
8 | 15296 | 10.7%
7 | 14724 | 10.3%
6 | 14170 | 9.9%
4 | 7891 | 5.5%
3 | 3916 | 2.7%
2 | 1369 | 1.0%
1 | 259 | 0.2%
16 | 63 | < 0.1%
Other values (6) | 94 | 0.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 259 | 0.2%
2 | 1369 | 1.0%
3 | 3916 | 2.7%
4 | 7891 | 5.5%
5 | 16215 | 11.3%
6 | 14170 | 9.9%
7 | 14724 | 10.3%
8 | 15296 | 10.7%
9 | 69427 | 48.4%
10 | 22 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
16 | 63 | < 0.1%
15 | 15 | < 0.1%
14 | 9 | < 0.1%
13 | 21 | < 0.1%
12 | 10 | < 0.1%
11 | 17 | < 0.1%
10 | 22 | < 0.1%
9 | 69427 | 48.4%
8 | 15296 | 10.7%
7 | 14724 | 10.3%

num_medications
Real number (ℝ)

Distinct: 75
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.776035
Minimum: 1
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 11
median: 15
Q3: 21
95-th percentile: 32
Maximum: 81
Range: 80
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.3971304
Coefficient of variation (CV): 0.50054322
Kurtosis: 3.6180322
Mean: 16.776035
Median Absolute Deviation (MAD): 5
Skewness: 1.3776779
Sum: 2406086
Variance: 70.511799
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
13 | 8399 | 5.9%
15 | 8315 | 5.8%
12 | 8227 | 5.7%
14 | 8069 | 5.6%
16 | 7888 | 5.5%
11 | 7756 | 5.4%
17 | 7097 | 4.9%
10 | 7028 | 4.9%
18 | 6635 | 4.6%
9 | 6351 | 4.4%
Other values (65) | 67659 | 47.2%

Minimum 10 values

Value | Count | Frequency (%)
1 | 262 | 0.2%
2 | 478 | 0.3%
3 | 942 | 0.7%
4 | 1563 | 1.1%
5 | 2316 | 1.6%
6 | 3163 | 2.2%
7 | 4232 | 3.0%
8 | 5432 | 3.8%
9 | 6351 | 4.4%
10 | 7028 | 4.9%

Maximum 10 values

Value | Count | Frequency (%)
81 | 2 | < 0.1%
79 | 3 | < 0.1%
75 | 6 | < 0.1%
74 | 3 | < 0.1%
72 | 6 | < 0.1%
70 | 4 | < 0.1%
69 | 9 | < 0.1%
68 | 10 | < 0.1%
67 | 14 | < 0.1%
66 | 8 | < 0.1%

num_procedures
Real number (ℝ)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3490211
Minimum: 0
Maximum: 6
Zeros: 65788
Zeros (%): 45.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.7191041
Coefficient of variation (CV): 1.2743345
Kurtosis: 0.822533
Mean: 1.3490211
Median Absolute Deviation (MAD): 1
Skewness: 1.3112832
Sum: 193482
Variance: 2.9553189
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
0 | 65788 | 45.9%
1 | 29039 | 20.2%
2 | 17788 | 12.4%
3 | 13252 | 9.2%
6 | 7277 | 5.1%
4 | 5951 | 4.1%
5 | 4329 | 3.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 65788 | 45.9%
1 | 29039 | 20.2%
2 | 17788 | 12.4%
3 | 13252 | 9.2%
4 | 5951 | 4.1%
5 | 4329 | 3.0%
6 | 7277 | 5.1%

Maximum 10 values

Value | Count | Frequency (%)
6 | 7277 | 5.1%
5 | 4329 | 3.0%
4 | 5951 | 4.1%
3 | 13252 | 9.2%
2 | 17788 | 12.4%
1 | 29039 | 20.2%
0 | 65788 | 45.9%

ndc_code
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 251
Distinct (%): 0.2%
Missing: 23462
Missing (%): 16.4%
Memory size: 1.1 MiB

Value | Count
68071-1701 | 20770
47918-902 | 20379
47918-898 | 6568
0173-0861 | 4060
50090-0353 | 4040
Other values (246) | 64145

Length

Max length: 10
Median length: 9
Mean length: 9.2123589
Min length: 9

Characters and Unicode

Total characters: 1105133
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
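
As an illustration of the character properties mentioned above, the sketch below tallies Unicode general categories for a handful of values using Python's standard `unicodedata` module. It is not necessarily how the profiling tool computes its own tables; the helper name `category_counts` is hypothetical, and the input is just the five sample rows listed for ndc_code in this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# Sample rows copied from the ndc_code section of this report.
sample = ["68071-1701", "0378-1110", "68071-1701", "0049-4110", "68071-1701"]
counts = category_counts(sample)

# "Nd" is the code for Decimal Number, "Pd" for Dash Punctuation.
print(counts)
```

Only the two categories reported for ndc_code (Decimal Number and Dash Punctuation) appear in the tally for these sample rows.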

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: 68071-1701
2nd row: 0378-1110
3rd row: 68071-1701
4th row: 0049-4110
5th row: 68071-1701

Common Values

Value | Count | Frequency (%)
68071-1701 | 20770 | 14.5%
47918-902 | 20379 | 14.2%
47918-898 | 6568 | 4.6%
0173-0861 | 4060 | 2.8%
50090-0353 | 4040 | 2.8%
0049-4110 | 3431 | 2.4%
0009-3449 | 2501 | 1.7%
0173-0863 | 2305 | 1.6%
0378-1110 | 2208 | 1.5%
0049-4120 | 2183 | 1.5%
Other values (241) | 51517 | 35.9%
(Missing) | 23462 | 16.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
68071-1701 | 20770 | 17.3%
47918-902 | 20379 | 17.0%
47918-898 | 6568 | 5.5%
0173-0861 | 4060 | 3.4%
50090-0353 | 4040 | 3.4%
0049-4110 | 3431 | 2.9%
0009-3449 | 2501 | 2.1%
0173-0863 | 2305 | 1.9%
0378-1110 | 2208 | 1.8%
0049-4120 | 2183 | 1.8%
Other values (241) | 51517 | 42.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 198776 | 18.0%
1 | 167028 | 15.1%
- | 119962 | 10.9%
7 | 112993 | 10.2%
8 | 105508 | 9.5%
9 | 102969 | 9.3%
4 | 80297 | 7.3%
2 | 63767 | 5.8%
3 | 63398 | 5.7%
6 | 51220 | 4.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 985171 | 89.1%
Dash Punctuation | 119962 | 10.9%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 198776 | 20.2%
1 | 167028 | 17.0%
7 | 112993 | 11.5%
8 | 105508 | 10.7%
9 | 102969 | 10.5%
4 | 80297 | 8.2%
2 | 63767 | 6.5%
3 | 63398 | 6.4%
6 | 51220 | 5.2%
5 | 39215 | 4.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 119962 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1105133 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 198776 | 18.0%
1 | 167028 | 15.1%
- | 119962 | 10.9%
7 | 112993 | 10.2%
8 | 105508 | 9.5%
9 | 102969 | 9.3%
4 | 80297 | 7.3%
2 | 63767 | 5.8%
3 | 63398 | 5.7%
6 | 51220 | 4.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1105133 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 198776 | 18.0%
1 | 167028 | 15.1%
- | 119962 | 10.9%
7 | 112993 | 10.2%
8 | 105508 | 9.5%
9 | 102969 | 9.3%
4 | 80297 | 7.3%
2 | 63767 | 5.8%
3 | 63398 | 5.7%
6 | 51220 | 4.6%

max_glu_serum
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 136409
Missing (%): 95.1%
Memory size: 1.1 MiB

Value | Count
Norm | 3220
>200 | 2043
>300 | 1752

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 28060
Distinct characters: 8
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: >300
2nd row: >300
3rd row: Norm
4th row: Norm
5th row: Norm

Common Values

Value | Count | Frequency (%)
Norm | 3220 | 2.2%
>200 | 2043 | 1.4%
>300 | 1752 | 1.2%
(Missing) | 136409 | 95.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
norm | 3220 | 45.9%
200 | 2043 | 29.1%
300 | 1752 | 25.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 7590 | 27.0%
> | 3795 | 13.5%
N | 3220 | 11.5%
o | 3220 | 11.5%
r | 3220 | 11.5%
m | 3220 | 11.5%
2 | 2043 | 7.3%
3 | 1752 | 6.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 11385 | 40.6%
Lowercase Letter | 9660 | 34.4%
Math Symbol | 3795 | 13.5%
Uppercase Letter | 3220 | 11.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 7590 | 66.7%
2 | 2043 | 17.9%
3 | 1752 | 15.4%

Lowercase Letter
Value | Count | Frequency (%)
o | 3220 | 33.3%
r | 3220 | 33.3%
m | 3220 | 33.3%

Math Symbol
Value | Count | Frequency (%)
> | 3795 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
N | 3220 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 15180 | 54.1%
Latin | 12880 | 45.9%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 7590 | 50.0%
> | 3795 | 25.0%
2 | 2043 | 13.5%
3 | 1752 | 11.5%

Latin
Value | Count | Frequency (%)
N | 3220 | 25.0%
o | 3220 | 25.0%
r | 3220 | 25.0%
m | 3220 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 28060 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 7590 | 27.0%
> | 3795 | 13.5%
N | 3220 | 11.5%
o | 3220 | 11.5%
r | 3220 | 11.5%
m | 3220 | 11.5%
2 | 2043 | 7.3%
3 | 1752 | 6.2%

A1Cresult
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 117650
Missing (%): 82.0%
Memory size: 1.1 MiB

Value | Count
>8 | 13110
Norm | 6955
>7 | 5709

Length

Max length: 4
Median length: 2
Mean length: 2.5396912
Min length: 2

Characters and Unicode

Total characters: 65458
Distinct characters: 7
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: >7
2nd row: >7
3rd row: >7
4th row: >8
5th row: Norm

Common Values

Value | Count | Frequency (%)
>8 | 13110 | 9.1%
Norm | 6955 | 4.8%
>7 | 5709 | 4.0%
(Missing) | 117650 | 82.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
8 | 13110 | 50.9%
norm | 6955 | 27.0%
7 | 5709 | 22.2%

Most occurring characters

Value | Count | Frequency (%)
> | 18819 | 28.7%
8 | 13110 | 20.0%
N | 6955 | 10.6%
o | 6955 | 10.6%
r | 6955 | 10.6%
m | 6955 | 10.6%
7 | 5709 | 8.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 20865 | 31.9%
Math Symbol | 18819 | 28.7%
Decimal Number | 18819 | 28.7%
Uppercase Letter | 6955 | 10.6%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 6955 | 33.3%
r | 6955 | 33.3%
m | 6955 | 33.3%

Decimal Number
Value | Count | Frequency (%)
8 | 13110 | 69.7%
7 | 5709 | 30.3%

Math Symbol
Value | Count | Frequency (%)
> | 18819 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
N | 6955 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 37638 | 57.5%
Latin | 27820 | 42.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
N | 6955 | 25.0%
o | 6955 | 25.0%
r | 6955 | 25.0%
m | 6955 | 25.0%

Common
Value | Count | Frequency (%)
> | 18819 | 50.0%
8 | 13110 | 34.8%
7 | 5709 | 15.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 65458 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
> | 18819 | 28.7%
8 | 13110 | 20.0%
N | 6955 | 10.6%
o | 6955 | 10.6%
r | 6955 | 10.6%
m | 6955 | 10.6%
7 | 5709 | 8.7%

change
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value | Count
Ch | 88669
No | 54755

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 286848
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: Ch
3rd row: No
4th row: Ch
5th row: Ch

Common Values

Value | Count | Frequency (%)
Ch | 88669 | 61.8%
No | 54755 | 38.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
ch | 88669 | 61.8%
no | 54755 | 38.2%

Most occurring characters

Value | Count | Frequency (%)
C | 88669 | 30.9%
h | 88669 | 30.9%
N | 54755 | 19.1%
o | 54755 | 19.1%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 143424 | 50.0%
Lowercase Letter | 143424 | 50.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
C | 88669 | 61.8%
N | 54755 | 38.2%

Lowercase Letter
Value | Count | Frequency (%)
h | 88669 | 61.8%
o | 54755 | 38.2%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 286848 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
C | 88669 | 30.9%
h | 88669 | 30.9%
N | 54755 | 19.1%
o | 54755 | 19.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 286848 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
C | 88669 | 30.9%
h | 88669 | 30.9%
N | 54755 | 19.1%
o | 54755 | 19.1%

readmitted
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value | Count
NO | 77248
>30 | 50434
<30 | 15742

Length

Max length: 3
Median length: 2
Mean length: 2.4614012
Min length: 2

Characters and Unicode

Total characters: 353024
Distinct characters: 6
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: >30
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value | Count | Frequency (%)
NO | 77248 | 53.9%
>30 | 50434 | 35.2%
<30 | 15742 | 11.0%
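
The frequency column in the table above is each count divided by the total number of observations. The short sketch below recomputes it, assuming the three counts listed for readmitted; the dictionary and variable names are illustrative only.

```python
# Counts copied from the readmitted "Common Values" table.
counts = {"NO": 77248, ">30": 50434, "<30": 15742}
total = sum(counts.values())

# Share of each class as a percentage of all observations, one decimal place.
shares = {label: round(100 * count / total, 1) for label, count in counts.items()}

print(total, shares)
```

Because readmitted has no missing values, the counts add up to the 143424 observations reported in the overview.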

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
no | 77248 | 53.9%
30 | 66176 | 46.1%

Most occurring characters

Value | Count | Frequency (%)
N | 77248 | 21.9%
O | 77248 | 21.9%
3 | 66176 | 18.7%
0 | 66176 | 18.7%
> | 50434 | 14.3%
< | 15742 | 4.5%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 154496 | 43.8%
Decimal Number | 132352 | 37.5%
Math Symbol | 66176 | 18.7%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
N | 77248 | 50.0%
O | 77248 | 50.0%

Decimal Number
Value | Count | Frequency (%)
3 | 66176 | 50.0%
0 | 66176 | 50.0%

Math Symbol
Value | Count | Frequency (%)
> | 50434 | 76.2%
< | 15742 | 23.8%

Most occurring scripts

Value | Count | Frequency (%)
Common | 198528 | 56.2%
Latin | 154496 | 43.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 66176 | 33.3%
0 | 66176 | 33.3%
> | 50434 | 25.4%
< | 15742 | 7.9%

Latin
Value | Count | Frequency (%)
N | 77248 | 50.0%
O | 77248 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 353024 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
N | 77248 | 21.9%
O | 77248 | 21.9%
3 | 66176 | 18.7%
0 | 66176 | 18.7%
> | 50434 | 14.3%
< | 15742 | 4.5%

Interactions

[Pairwise scatter plots of the 13 numeric variables (13 × 13 grid); images not preserved.]

Correlations

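variable | encounter_id | patient_nbr | admission_type_id | discharge_disposition_id | admission_source_id | time_in_hospital | number_outpatient | number_inpatient | number_emergency | num_lab_procedures | number_diagnoses | num_medications | num_procedures | race | gender | age | weight | payer_code | medical_specialty | max_glu_serum | A1Cresult | change | readmitted
encounter_id | 1.000 | 0.530 | -0.117 | -0.054 | -0.050 | -0.055 | 0.137 | 0.035 | 0.123 | -0.005 | 0.290 | 0.103 | -0.034 | 0.074 | 0.014 | 0.036 | 0.071 | 0.070 | 0.177 | 0.179 | 0.128 | 0.115 | 0.074
patient_nbr | 0.530 | 1.000 | 0.011 | -0.036 | 0.029 | -0.013 | 0.144 | 0.028 | 0.108 | 0.031 | 0.239 | 0.042 | -0.021 | 0.110 | 0.035 | 0.039 | 0.014 | 0.131 | 0.183 | 0.159 | 0.124 | 0.117 | 0.119
admission_type_id | -0.117 | 0.011 | 1.000 | 0.033 | -0.393 | -0.007 | 0.034 | -0.038 | -0.028 | -0.219 | -0.117 | 0.100 | 0.220 | 0.069 | 0.022 | 0.039 | 0.107 | 0.103 | 0.287 | 0.123 | 0.072 | 0.065 | 0.041
discharge_disposition_id | -0.054 | -0.036 | 0.033 | 1.000 | 0.034 | 0.275 | 0.038 | 0.080 | 0.009 | 0.054 | 0.151 | 0.172 | 0.026 | 0.025 | 0.036 | 0.057 | 0.035 | 0.049 | 0.133 | 0.069 | 0.041 | 0.091 | 0.115
admission_source_id | -0.050 | 0.029 | -0.393 | 0.034 | 1.000 | -0.001 | 0.021 | 0.049 | 0.099 | 0.136 | 0.106 | -0.074 | -0.205 | 0.050 | 0.019 | 0.036 | 0.135 | 0.053 | 0.274 | 0.153 | 0.043 | 0.025 | 0.055
time_in_hospital | -0.055 | -0.013 | -0.007 | 0.275 | -0.001 | 1.000 | -0.016 | 0.086 | -0.001 | 0.339 | 0.235 | 0.466 | 0.197 | 0.014 | 0.038 | 0.042 | 0.040 | 0.037 | 0.100 | 0.146 | 0.025 | 0.113 | 0.047
number_outpatient | 0.137 | 0.144 | 0.034 | 0.038 | 0.021 | -0.016 | 1.000 | 0.148 | 0.168 | -0.026 | 0.108 | 0.067 | -0.023 | 0.016 | 0.009 | 0.007 | 0.000 | 0.020 | 0.030 | 0.021 | 0.025 | 0.008 | 0.029
number_inpatient | 0.035 | 0.028 | -0.038 | 0.080 | 0.049 | 0.086 | 0.148 | 1.000 | 0.218 | 0.030 | 0.133 | 0.083 | -0.066 | 0.007 | 0.012 | 0.045 | 0.007 | 0.030 | 0.047 | 0.066 | 0.015 | 0.009 | 0.124
number_emergency | 0.123 | 0.108 | -0.028 | 0.009 | 0.099 | -0.001 | 0.168 | 0.218 | 1.000 | 0.001 | 0.089 | 0.037 | -0.047 | 0.000 | 0.014 | 0.029 | 0.000 | 0.030 | 0.039 | 0.010 | 0.011 | 0.011 | 0.031
num_lab_procedures | -0.005 | 0.031 | -0.219 | 0.054 | 0.136 | 0.339 | -0.026 | 0.030 | 0.001 | 1.000 | 0.171 | 0.251 | 0.034 | 0.046 | 0.023 | 0.023 | 0.052 | 0.039 | 0.134 | 0.151 | 0.030 | 0.057 | 0.030
number_diagnoses | 0.290 | 0.239 | -0.117 | 0.151 | 0.106 | 0.235 | 0.108 | 0.133 | 0.089 | 0.171 | 1.000 | 0.289 | 0.071 | 0.056 | 0.009 | 0.120 | 0.091 | 0.075 | 0.154 | 0.055 | 0.109 | 0.040 | 0.084
num_medications | 0.103 | 0.042 | 0.100 | 0.172 | -0.074 | 0.466 | 0.067 | 0.083 | 0.037 | 0.251 | 0.289 | 1.000 | 0.363 | 0.039 | 0.054 | 0.062 | 0.030 | 0.041 | 0.166 | 0.142 | 0.035 | 0.252 | 0.064
num_procedures | -0.034 | -0.021 | 0.220 | 0.026 | -0.205 | 0.197 | -0.023 | -0.066 | -0.047 | 0.034 | 0.071 | 0.363 | 1.000 | 0.027 | 0.063 | 0.061 | 0.053 | 0.042 | 0.225 | 0.050 | 0.027 | 0.027 | 0.038
race | 0.074 | 0.110 | 0.069 | 0.025 | 0.050 | 0.014 | 0.016 | 0.007 | 0.000 | 0.046 | 0.056 | 0.039 | 0.027 | 1.000 | 0.073 | 0.099 | 0.038 | 0.105 | 0.130 | 0.028 | 0.060 | 0.022 | 0.026
gender | 0.014 | 0.035 | 0.022 | 0.036 | 0.019 | 0.038 | 0.009 | 0.012 | 0.014 | 0.023 | 0.009 | 0.054 | 0.063 | 0.073 | 1.000 | 0.106 | 0.217 | 0.108 | 0.154 | 0.000 | 0.038 | 0.021 | 0.020
age | 0.036 | 0.039 | 0.039 | 0.057 | 0.036 | 0.042 | 0.007 | 0.045 | 0.029 | 0.023 | 0.120 | 0.062 | 0.061 | 0.099 | 0.106 | 1.000 | 0.149 | 0.193 | 0.294 | 0.119 | 0.174 | 0.070 | 0.040
weight | 0.071 | 0.014 | 0.107 | 0.035 | 0.135 | 0.040 | 0.000 | 0.007 | 0.000 | 0.052 | 0.091 | 0.030 | 0.053 | 0.038 | 0.217 | 0.149 | 1.000 | 0.091 | 0.088 | 0.000 | 0.087 | 0.092 | 0.057
payer_code | 0.070 | 0.131 | 0.103 | 0.049 | 0.053 | 0.037 | 0.020 | 0.030 | 0.030 | 0.039 | 0.075 | 0.041 | 0.042 | 0.105 | 0.108 | 0.193 | 0.091 | 1.000 | 0.114 | 0.146 | 0.187 | 0.090 | 0.065
medical_specialty | 0.177 | 0.183 | 0.287 | 0.133 | 0.274 | 0.100 | 0.030 | 0.047 | 0.039 | 0.134 | 0.154 | 0.166 | 0.225 | 0.130 | 0.154 | 0.294 | 0.088 | 0.114 | 1.000 | 0.117 | 0.126 | 0.153 | 0.108
max_glu_serum | 0.179 | 0.159 | 0.123 | 0.069 | 0.153 | 0.146 | 0.021 | 0.066 | 0.010 | 0.151 | 0.055 | 0.142 | 0.050 | 0.028 | 0.000 | 0.119 | 0.000 | 0.146 | 0.117 | 1.000 | 0.386 | 0.235 | 0.056
A1Cresult | 0.128 | 0.124 | 0.072 | 0.041 | 0.043 | 0.025 | 0.025 | 0.015 | 0.011 | 0.030 | 0.109 | 0.035 | 0.027 | 0.060 | 0.038 | 0.174 | 0.087 | 0.187 | 0.126 | 0.386 | 1.000 | 0.173 | 0.010
change | 0.115 | 0.117 | 0.065 | 0.091 | 0.025 | 0.113 | 0.008 | 0.009 | 0.011 | 0.057 | 0.040 | 0.252 | 0.027 | 0.022 | 0.021 | 0.070 | 0.092 | 0.090 | 0.153 | 0.235 | 0.173 | 1.000 | 0.034
readmitted | 0.074 | 0.119 | 0.041 | 0.115 | 0.055 | 0.047 | 0.029 | 0.124 | 0.031 | 0.030 | 0.084 | 0.064 | 0.038 | 0.026 | 0.020 | 0.040 | 0.057 | 0.065 | 0.108 | 0.056 | 0.010 | 0.034 | 1.000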
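
A matrix of this size is easier to read once it is reduced to its strongest off-diagonal pairs. The following is a minimal sketch of that reduction, not the profiler's own code. It uses four variables whose coefficients are copied from the table above; the helper name strongest_pairs is hypothetical.

```python
import numpy as np
import pandas as pd

# Coefficients copied from the correlation table for four numeric variables.
cols = ["time_in_hospital", "num_lab_procedures", "num_medications", "num_procedures"]
corr = pd.DataFrame(
    [
        [1.000, 0.339, 0.466, 0.197],
        [0.339, 1.000, 0.251, 0.034],
        [0.466, 0.251, 1.000, 0.363],
        [0.197, 0.034, 0.363, 1.000],
    ],
    index=cols,
    columns=cols,
)


def strongest_pairs(corr: pd.DataFrame, top: int = 3) -> pd.Series:
    """Return the `top` off-diagonal pairs ranked by absolute coefficient.

    Hypothetical helper for illustration; assumes a square, symmetric matrix.
    """
    # Indices of the strict upper triangle, so each pair appears exactly once.
    i, j = np.triu_indices(len(corr), k=1)
    pairs = pd.Series(
        corr.to_numpy()[i, j],
        index=pd.MultiIndex.from_arrays([corr.index[i], corr.columns[j]]),
    )
    # Rank by magnitude so strong negative coefficients are not pushed down.
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(top)


top_pairs = strongest_pairs(corr)
print(top_pairs)
```

Ranking by absolute value matters for the full table, where admission_type_id and admission_source_id have a coefficient of -0.393.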

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
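
The three views described above can be approximated with plain pandas. This is a sketch of one common way to compute them, not necessarily what the profiler does internally. The toy frame borrows column names from this report, but its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (values are invented; only the column names come from the report).
toy = pd.DataFrame(
    {
        "weight": [np.nan, np.nan, np.nan, "[75-100)", np.nan, np.nan],
        "payer_code": [np.nan, "MC", np.nan, "MC", "DM", np.nan],
        "A1Cresult": [np.nan, ">8", np.nan, ">8", "Norm", np.nan],
        "gender": ["Female", "Male", "Male", "Female", "Male", "Female"],
    }
)

# Nullity matrix: True where a cell is missing.
null_mask = toy.isna()

# Nullity by column: count and share of missing cells.
null_counts = null_mask.sum()
null_share = null_mask.mean()

# Nullity correlation: correlate the missing/present indicators.
# Columns that are never (or always) missing have zero variance, so drop them.
varying = null_mask.loc[:, null_mask.nunique() > 1]
nullity_corr = varying.astype(int).corr()

print(null_counts)
print(nullity_corr.round(3))
```

A coefficient near 1 means two columns tend to be missing on the same rows; a value near -1 means one tends to be present when the other is absent.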

Sample

encounter_id | patient_nbr | race | gender | age | weight | admission_type_id | discharge_disposition_id | admission_source_id | time_in_hospital | payer_code | medical_specialty | primary_diagnosis_code | other_diagnosis_codes | number_outpatient | number_inpatient | number_emergency | num_lab_procedures | number_diagnoses | num_medications | num_procedures | ndc_code | max_glu_serum | A1Cresult | change | readmitted
022783928222157CaucasianFemale[0-10)NaN62511NaNPediatrics-Endocrinology250.83?|?00041110NaNNaNNaNNoNO
114919055629189CaucasianFemale[10-20)NaN1173NaNNaN276250.01|25500059918068071-1701NaNNaNCh>30
26441086047875AfricanAmericanFemale[20-30)NaN1172NaNNaN648250|V272101161350378-1110NaNNaNNoNO
350036482442376CaucasianMale[30-40)NaN1172NaNNaN8250.43|40300044716168071-1701NaNNaNChNO
41668042519267CaucasianMale[40-50)NaN1171NaNNaN197157|250000515800049-4110NaNNaNChNO
51668042519267CaucasianMale[40-50)NaN1171NaNNaN197157|2500005158068071-1701NaNNaNChNO
63575482637451CaucasianMale[50-60)NaN2123NaNNaN414411|25000031916647918-902NaNNaNNo>30
75584284259809CaucasianMale[60-70)NaN3124NaNNaN414411|V4500070721135208-001NaNNaNChNO
85584284259809CaucasianMale[60-70)NaN3124NaNNaN414411|V4500070721116729-001NaNNaNChNO
95584284259809CaucasianMale[60-70)NaN3124NaNNaN414411|V4500070721147918-891NaNNaNChNO
encounter_id | patient_nbr | race | gender | age | weight | admission_type_id | discharge_disposition_id | admission_source_id | time_in_hospital | payer_code | medical_specialty | primary_diagnosis_code | other_diagnosis_codes | number_outpatient | number_inpatient | number_emergency | num_lab_procedures | number_diagnoses | num_medications | num_procedures | ndc_code | max_glu_serum | A1Cresult | change | readmitted
14341444384717650375628AfricanAmericanFemale[60-70)NaN1176DMNaN345438|41232145925150090-0353NaNNaNCh>30
143415443847548100162476AfricanAmericanMale[70-80)NaN1373MCNaN250.13291|45800051916042708-009NaN>8Ch>30
143416443847548100162476AfricanAmericanMale[70-80)NaN1373MCNaN250.13291|45800051916068071-1701NaN>8Ch>30
14341744384778274694222AfricanAmericanFemale[80-90)NaN1455MCNaN560276|78701033918368071-1701NaNNaNNoNO
14341844385414841088789CaucasianMale[70-80)NaN1171MCNaN38590|29610053139010631-019NaNNaNChNO
14341944385414841088789CaucasianMale[70-80)NaN1171MCNaN38590|29610053139047918-902NaNNaNChNO
14342044385716631693671CaucasianFemale[80-90)NaN23710MCSurgery-General996285|9980104592120049-4110NaNNaNChNO
14342144385716631693671CaucasianFemale[80-90)NaN23710MCSurgery-General996285|9980104592120781-5421NaNNaNChNO
14342244385716631693671CaucasianFemale[80-90)NaN23710MCSurgery-General996285|99801045921247918-902NaNNaNChNO
143423443867222175429310CaucasianMale[70-80)NaN1176NaNNaN530530|78700013933NaNNaNNaNNoNO
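
The sample tables show the first and last rows of the dataset, and consecutive rows often appear to differ only in ndc_code. The sketch below shows how such samples and a rows-per-encounter check could be produced with pandas. The identifiers in the toy frame are hypothetical and the ndc_code strings are illustrative.

```python
import pandas as pd

# Toy frame: identifiers are made up; one row per (encounter, ndc_code) line.
toy = pd.DataFrame(
    {
        "encounter_id": [101, 102, 102, 103, 103, 103],
        "patient_nbr": [9001, 9002, 9002, 9003, 9003, 9003],
        "ndc_code": [None, "0049-4110", "68071-1701",
                     "35208-001", "16729-001", "47918-891"],
    }
)

# First and last rows, analogous to the two sample tables.
first_rows = toy.head(2)
last_rows = toy.tail(2)

# How many rows each encounter contributes, and how many distinct
# non-missing ndc_code values appear within it.
rows_per_encounter = toy.groupby("encounter_id").size()
codes_per_encounter = toy.groupby("encounter_id")["ndc_code"].nunique()

print(rows_per_encounter)
print(codes_per_encounter)
```

If the two counts match for most encounters, the repeated rows are likely one line per prescribed drug code rather than accidental duplicates, which would fit the report showing 0 duplicate rows but only 71.0% distinct encounter_id values.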